PASTASpark: multiple sequence alignment meets Big Data
Similar resources
PASTASpark: multiple sequence alignment meets Big Data
Motivation: One basic step in many bioinformatics analyses is multiple sequence alignment. One of the state-of-the-art tools for performing multiple sequence alignment is PASTA (Practical Alignments using SATé and TrAnsitivity). PASTA supports multithreading, but it is limited to processing datasets on shared-memory systems. In this work we introduce PASTASpark, a tool that uses the Big Data engine ...
Multiple Sequence Alignment
An algorithm for progressive multiple alignment of sequences with insertions. 1. Introduction: The problem of sequence alignment is to find the patterns of sequence conservation and similarity between pairs or sets of given sequences. In biological contexts, similarity between biological sequences usually amounts to either functional or structural similarities or divergence from a common ance...
Multiple sequence alignment.
Multiple sequence alignments are an essential tool for protein structure and function prediction, phylogeny inference and other common tasks in sequence analysis. Recently developed systems have advanced the state of the art with respect to accuracy, ability to scale to thousands of proteins and flexibility in comparing proteins that do not share the same domain architecture. New multiple align...
Multiple protein sequence alignment.
Multiple sequence alignments are essential in computational analysis of protein sequences and structures, with applications in structure modeling, functional site prediction, phylogenetic analysis and sequence database searching. Constructing accurate multiple alignments for divergent protein sequences remains a difficult computational task, and alignment speed becomes an issue for large sequen...
Contextual Multiple Sequence Alignment
In a recently proposed contextual alignment model, efficient algorithms exist for global and local pairwise alignment of protein sequences. Preliminary results obtained for biological data are very promising. Our main motivation was to adapt the idea of context dependency to the multiple-alignment setting. To this end, a relaxation of the model was developed (we call this new model averaged co...
Journal
Journal title: Bioinformatics
Year: 2017
ISSN: 1367-4803,1460-2059
DOI: 10.1093/bioinformatics/btx354